Add Roberta converter #2124

omkar-334 · 2025-03-04T06:47:34Z

A few doubts -

the model outputs from keras and huggingface are not similar at all.

from transformers import RobertaTokenizer
from transformers import TFRobertaModel

hf_model = TFRobertaModel.from_pretrained("roberta-base", output_hidden_states=True)
tokenizer = RobertaTokenizer.from_pretrained("roberta-base")

text = "Hello, how are you?"
inputs = tokenizer(text, return_tensors="tf", padding=True, truncation=True)
hf_output = hf_model(**inputs).last_hidden_state
keras_inputs = {
    "token_ids": inputs["input_ids"].numpy(),  # Token IDs
    "padding_mask": inputs["attention_mask"].numpy(),  # Padding Mask
}
keras_output = model(keras_inputs)

Hugging Face’s RoBERTa uses 514 position embeddings (512 positions + 2 extra tokens), whereas Keras only expects 512.
Tokenizer comparison

google-cla · 2025-03-04T06:47:40Z

Thanks for your pull request! It looks like this may be your first contribution to a Google open source project. Before we can look at your pull request, you'll need to sign a Contributor License Agreement (CLA).

View this failed invocation of the CLA check for more information.

For the most up to date status, view the checks section at the bottom of the pull request.

omkar-334 · 2025-03-04T21:08:35Z

Here's the link to the testing colab - https://colab.research.google.com/github/omkar-334/keras-scripts/blob/main/RoBERTa_converter.ipynb

Also,
RoBERTa doesn't have segment embeddings and pooled layers, but the huggingface model includes a constant segment embedding of dimensions (1, 512) I think. It also included pooled layers for downstream tasks, which the Keras implementation doesn't have.

JyotinderSingh · 2025-03-20T07:54:38Z

Hi @omkar-334. thanks for this PR.
Regarding the mismatched logits, it looks like you're loading the HuggingFace model in float32

hf_model = TFRobertaModel.from_pretrained("roberta-base")

while the Keras model is being quantized into bfloat16.

model = keras_hub.models.RobertaBackbone.from_preset("hf://FacebookAI/roberta-base", dtype="bfloat16")

It might be worth trying to load them in the same precision when verifying the logics.

JyotinderSingh · 2025-03-20T08:08:03Z

I did try to run your notebook by loading in both sets of weights as float32, but the results still don't seem to match.

Huggingface output
 tf.Tensor(
[[[-0.06098365  0.1249077  -0.01024082 ... -0.05549879 -0.05278065
   -0.02032274]
  [-0.33764896  0.20138153  0.07472473 ...  0.16684803  0.02431546
   -0.13936469]
  [-0.02943649  0.23096977  0.18173131 ... -0.14693598 -0.05403079
   -0.02496235]
  ...
  [-0.11631897  0.2576879   0.0894694  ... -0.01494528  0.07766235
    0.03402137]
  [-0.07787547  0.2642327   0.44699728 ... -0.7686613   0.02006039
    0.07307038]
  [-0.0507166   0.14344664 -0.03572293 ... -0.10117416 -0.05277743
   -0.05274259]]], shape=(1, 8, 768), dtype=float32)

Keras output
 tf.Tensor(
[[[-7.23304749e-02  1.11076608e-01 -7.59335235e-04 ... -9.13275555e-02
   -4.67573255e-02 -2.74974313e-02]
  [-1.94556322e-02  7.97019601e-02  1.06528938e-01 ... -2.88743407e-01
   -1.66224763e-02  5.16433269e-02]
  [-7.21454322e-02  1.10889256e-01 -5.06145880e-04 ... -9.06197801e-02
   -4.66767214e-02 -2.69957650e-02]
  ...
  [-3.83148864e-02  1.94189698e-01  2.10571475e-03 ...  6.51391894e-02
   -4.42184880e-03  4.94358130e-02]
  [ 2.18277685e-02  1.65410444e-01  3.22254300e-01 ... -5.11629343e-01
    3.71083468e-02  8.14208537e-02]
  [-6.67822510e-02  1.25953302e-01 -1.97500065e-02 ... -1.38220027e-01
   -4.70701084e-02 -5.49893379e-02]]], shape=(1, 8, 768), dtype=float32)

AssertionError: 
Not equal to tolerance rtol=1e-07, atol=1e-05
...

github-actions · 2025-05-16T02:12:05Z

This PR is stale because it has been open for 14 days with no activity. It will be closed if no further activity occurs. Thank you.

github-actions · 2025-05-30T05:40:17Z

This PR was closed because it has been inactive for 28 days. Please reopen if you'd like to work on this further.

omkar-334 added 4 commits March 4, 2025 10:29

Added RoBERTa converter

ce2e3bc

Add RoBERTa coverter tests

627ad1b

Added RoBERTA in preset_loader

2035ba3

fix key names

269fbd7

divyashreepathihalli requested a review from JyotinderSingh March 19, 2025 05:06

sachinprasadhs added the stat:awaiting response from contributor label May 1, 2025

github-actions bot added the stale label May 16, 2025

github-actions bot closed this May 30, 2025

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

Add Roberta converter #2124

Add Roberta converter #2124

Uh oh!

omkar-334 commented Mar 4, 2025 •

edited

Loading

Uh oh!

google-cla bot commented Mar 4, 2025

Uh oh!

omkar-334 commented Mar 4, 2025

Uh oh!

JyotinderSingh commented Mar 20, 2025

Uh oh!

JyotinderSingh commented Mar 20, 2025

Uh oh!

github-actions bot commented May 16, 2025

Uh oh!

github-actions bot commented May 30, 2025

Uh oh!

Uh oh!

Add Roberta converter #2124

Add Roberta converter #2124

Uh oh!

Conversation

omkar-334 commented Mar 4, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

google-cla bot commented Mar 4, 2025

Uh oh!

omkar-334 commented Mar 4, 2025

Uh oh!

JyotinderSingh commented Mar 20, 2025

Uh oh!

JyotinderSingh commented Mar 20, 2025

Uh oh!

github-actions bot commented May 16, 2025

Uh oh!

github-actions bot commented May 30, 2025

Uh oh!

Uh oh!

omkar-334 commented Mar 4, 2025 •

edited

Loading